Offline to Online Conversion
We consider the problem of converting offline estimators into an online
predictor or estimator with small extra regret. Formally this is the problem of
merging a collection of probability measures over strings of length 1,2,3,...
into a single probability measure over infinite sequences. We describe various
approaches and their pros and cons on various examples. As a side-result we
give an elementary non-heuristic purely combinatoric derivation of Turing's
famous estimator. Our main technical contribution is to determine the
computational complexity of online estimators with good guarantees in general. Comment: 20 LaTeX page
Relativistic jet models for the BL Lacertae object Mrk 421 during three epochs of observation
Coordinated observations of the nearby BL Lacertae object Mrk 421 obtained during May 1980, January 1984, and March 1984 are described. These observations give a time-frozen picture of the continuous spectrum of Mrk 421 at X-ray, ultraviolet, optical, and radio wavelengths. The observed spectra have been fitted to an inhomogeneous relativistic jet model. In general, the models reproduce the data well. Many of the observed differences during the three epochs can be attributed to variations in the opening angle of the jet and in the angle that the jet makes to the line of sight. The jet models obtained here are compared with the homogeneous, spherically symmetric, synchrotron self-Compton models for this source. The models are also compared with the relativistic jet models obtained for other active galactic nuclei.
Self-Modification of Policy and Utility Function in Rational Agents
Any agent that is part of the environment it interacts with and has versatile
actuators (such as arms and fingers), will in principle have the ability to
self-modify -- for example by changing its own source code. As we continue to
create more and more intelligent agents, chances increase that they will learn
about this ability. The question is: will they want to use it? For example,
highly intelligent systems may find ways to change their goals to something
more easily achievable, thereby `escaping' the control of their designers. In
an important paper, Omohundro (2008) argued that goal preservation is a
fundamental drive of any intelligent system, since a goal is more likely to be
achieved if future versions of the agent strive towards the same goal. In this
paper, we formalise this argument in general reinforcement learning, and
explore situations where it fails. Our conclusion is that the self-modification
possibility is harmless if and only if the value function of the agent
anticipates the consequences of self-modifications and uses the current utility
function when evaluating the future. Comment: Artificial General Intelligence (AGI) 201
Optimistic Agents are Asymptotically Optimal
We use optimism to introduce generic asymptotically optimal reinforcement
learning agents. They achieve, with an arbitrary finite or compact class of
environments, asymptotically optimal behavior. Furthermore, in the finite
deterministic case we provide finite error bounds. Comment: 13 LaTeX page
Mass Density Fluctuations in Quantum and Classical descriptions of Liquid Water
A first-principles molecular dynamics simulation protocol is established using
the revised Perdew-Burke-Ernzerhof functional (revPBE) in conjunction with
Grimme's third-generation dispersion correction (D3) to describe properties
of water at ambient conditions. This study also demonstrates the consistency of
the structure of water across both isobaric (NpT) and isothermal (NVT)
ensembles. Going beyond the standard structural benchmarks for liquid water, we
compute properties that are connected to both local structure and mass density
fluctuations that are related to concepts of solvation and hydrophobicity. We
directly compare our revPBE results to the Becke-Lee-Yang-Parr (BLYP) plus
Grimme dispersion corrections (D2) and to both the empirical fixed-charge model
(SPC/E) and the many-body interaction potential model (MB-pol) to further our
understanding of how the computed properties herein depend on the form of the
interaction potential.